Protein expression (healthy tissue): PaxDb protein abundance

Portfolio targets

Author

Target Sciences

Published

13 February 2026

Department: Therapeutics (Target Sciences)

knitr::read_chunk("01b-frontMatter.R")
knitr::opts_chunk$set(
  fig.width = 12, 
  fig.height = 8, 
  fig.path = "markdown_figs/", 
  dev = "png", 
  eval = TRUE, 
  echo = FALSE, 
  warning = FALSE, 
  message = FALSE, 
  tidy = FALSE
)
# Record the pandoc output format so that later chunks can produce
# document-type dependent output (e.g. interactive graphs). See:
# https://trinkerrstuff.wordpress.com/2014/11/18/rmarkdown-alter-action-depending-on-document/
document_output_type <- knitr::opts_knit$get("rmarkdown.pandoc.to")
# print(document_output_type)

1 Aim

To identify tissues at risk of on-target off-tumour toxicity.

2 Results

We retrieved quantitative protein abundance data for healthy human tissues from the PaxDb database.

The data were first filtered to the tissues of interest (lung, heart, kidney and skin) and then to the current set of Bicycle targets (“Bicycle_Merged_Target_List_Reserved_targets_06_02_26_FINAL.xlsx”).
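
The two filtering steps can be sketched in base R as follows. This is a minimal illustration on a made-up table, not the code used for this report: the column names (`tissue`, `gene`, `abundance_ppm`), the gene labels and the abundance values are all hypothetical, and in practice the target list would be read from the spreadsheet rather than typed in.

```r
# Toy stand-in for the combined PaxDb table (hypothetical columns and values).
paxdb <- data.frame(
  tissue        = c("LUNG", "HEART", "LIVER", "KIDNEY", "SKIN", "LUNG"),
  gene          = c("GENE_A", "GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_C"),
  abundance_ppm = c(12.5, 3.1, 8.0, 1.2, 4.4, 950.0),
  stringsAsFactors = FALSE
)

# Tissues of interest, as named in the text.
tissues_of_interest <- c("LUNG", "HEART", "KIDNEY", "SKIN")

# Placeholder target list (assumption: targets are matched by gene label).
target_genes <- c("GENE_A", "GENE_B")

# Step 1: keep only the tissues of interest (case-insensitive match).
by_tissue <- paxdb[toupper(paxdb$tissue) %in% tissues_of_interest, ]

# Step 2: keep only the rows that belong to a target.
filtered <- by_tissue[by_tissue$gene %in% target_genes, ]

print(filtered)
```

Filtering by tissue first and by target second mirrors the order described above; because both steps are simple row filters, swapping them would give the same final table.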

3 Methods

3.1 Datahub: PaxDb

Human protein abundance data were downloaded manually from PaxDb (version 6.0):

  • https://pax-db.org/downloads/6.0/datasets/9606.zip

The following description is reproduced from the reference articles [1, 2]:

The PaxDb database (Protein Abundances Across Organisms) is an integrative metaresource dedicated to absolute protein abundance levels in whole organism or tissue-specific proteomes (8, 9). PaxDb focuses on creating a consensus view on normal/healthy proteomes and expresses abundance values in “parts per million” (ppm) in relation to all other protein molecules in the sample. Since the last PaxDb update, the proteomics community has grown continuously: roughly 1000 projects per month are submitted to ProteomeXchange, the largest centralized platform for MS-derived primary data submission (10), involving PeptideAtlas (11), PRIDE (12), iProX (13), and jPOST (14) among others. For the latest version 5.0 of PaxDb, we have further improved data integration by extending the types of raw data imported from the various repositories and by expanding the number of organisms and tissue groups as well as the proteome depth of previously covered organisms.

Protein abundance datasets in PaxDb are re-scaled to a common abundance metric (‘parts per million’), and also ranked via a universally applicable, albeit somewhat indirect quality score. For the re-scaling, the datasets are first parsed or processed such that the data reflect proportional abundances of whole protein molecules (i.e. proportionality to counts of complete, individual protein molecules, not to molecular weights, protein volumes, or digested peptides). In the case of spectral counting data, this is done via an in-house pipeline that takes into account protein sizes and estimated relative detectabilities of peptides. For other datasets, the procedures depend on the type of data and the type of quantitative information that is provided (datasets which cannot be converted to proportional abundances of entire protein molecules are discarded). Then, the proportional abundances are re-scaled linearly to add up to one million; this means the abundance of each protein of interest is finally expressed in ‘parts per million’, relative to all other proteins in a sample. While this metric cannot be directly converted to ‘molecules per cell’, it has the advantage of being comparable/meaningful across cells of different volumes, or across tissues of different cellular and extracellular compositions.
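
The linear re-scaling to "parts per million" described above can be illustrated with a short base R sketch. The input values and the helper name `to_ppm` are hypothetical, and the upstream processing performed by PaxDb (for example, correcting spectral counts for protein size and peptide detectability) is not reproduced here.

```r
# Hypothetical proportional abundances (arbitrary units) for one sample.
raw_abundance <- c(P1 = 50, P2 = 30, P3 = 15, P4 = 5)

# Re-scale linearly so that the values in a sample add up to one million.
to_ppm <- function(x) {
  stopifnot(is.numeric(x), all(x >= 0), sum(x) > 0)
  x / sum(x) * 1e6
}

ppm <- to_ppm(raw_abundance)
print(ppm)
```

Because each value is expressed relative to the sample total, a protein's ppm value depends on every other protein quantified in the same sample, which is why the metric cannot be converted directly to molecules per cell.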

4 R session details

Analysis was performed using R (ver. 4.5.1) and the following additional packages:

Packages used

Package       Version  Author
ggplot2       4.0.0    Hadley Wickham [aut] (ORCID: https://orcid.org/0000-0003-4757-117X), Winston Chang [aut] (ORCID: https://orcid.org/0000-0002-1576-2126), Lionel Henry [aut], Thomas Lin Pedersen [aut, cre] (ORCID: https://orcid.org/0000-0002-5147-4711), Kohske Takahashi [aut], Claus Wilke [aut] (ORCID: https://orcid.org/0000-0002-7470-9261), Kara Woo [aut] (ORCID: https://orcid.org/0000-0002-5125-4188), Hiroaki Yutani [aut] (ORCID: https://orcid.org/0000-0002-3385-7233), Dewey Dunnington [aut] (ORCID: https://orcid.org/0000-0002-9415-4582), Teun van den Brand [aut] (ORCID: https://orcid.org/0000-0002-9335-7468), Posit, PBC [cph, fnd] (ROR: https://ror.org/03wc8by49)
RColorBrewer  1.1-3    Erich Neuwirth [aut, cre]
sysfonts      0.8.9    Yixuan Qiu and authors/contributors of the included fonts. See file AUTHORS for details.

R version 4.5.1 (2025-06-13 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United Kingdom.utf8 
[2] LC_CTYPE=English_United Kingdom.utf8   
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.utf8    

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_4.0.0      sysfonts_0.8.9     RColorBrewer_1.1-3

loaded via a namespace (and not attached):
 [1] sass_0.4.10       generics_0.1.4    digest_0.6.37     magrittr_2.0.4   
 [5] evaluate_1.0.5    grid_4.5.1        showtextdb_3.0    fastmap_1.2.0    
 [9] jsonlite_2.0.0    processx_3.8.6    backports_1.5.0   secretbase_1.0.5 
[13] ps_1.9.1          pander_0.6.6      crosstalk_1.2.2   scales_1.4.0     
[17] codetools_0.2-20  jquerylib_0.1.4   cli_3.6.5         rlang_1.1.6      
[21] withr_3.0.2       cachem_1.1.0      yaml_2.3.10       tools_4.5.1      
[25] dplyr_1.1.4       base64url_1.4     DT_0.34.0         showtext_0.9-7   
[29] curl_7.0.0        vctrs_0.6.5       R6_2.6.1          lifecycle_1.0.4  
[33] htmlwidgets_1.6.4 targets_1.11.4    pkgconfig_2.0.3   callr_3.7.6      
[37] pillar_1.11.1     bslib_0.10.0      gtable_0.3.6      glue_1.8.0       
[41] data.table_1.17.8 Rcpp_1.1.0        xfun_0.54         tibble_3.3.0     
[45] tidyselect_1.2.1  rstudioapi_0.17.1 knitr_1.50        farver_2.1.2     
[49] htmltools_0.5.8.1 igraph_2.2.1      rmarkdown_2.30    compiler_4.5.1   
[53] prettyunits_1.2.0 S7_0.2.0         

5 References

1. Huang, Q., Szklarczyk, D., Wang, M., Simonovic, M. & Mering, C. von. PaxDb 5.0: Curated protein quantification data suggests adaptive proteome changes in yeasts. Molecular & Cellular Proteomics 22, 100640 (2023).
2. Huang, Q., Szklarczyk, D., Oehninger, J. & Mering, C. von. PaxDb v6.0: Reprocessed, LLM-selected, curated protein abundance data across organisms. Nucleic Acids Research 54, D427–D439 (2026).